The Hidden Costs of AI: A Review of Energy, E-Waste, and Inequality in Model Development

Winsta, Jenis

arXiv.org Artificial Intelligence

Artificial intelligence (AI) has made remarkable progress in recent years, yet its rapid expansion brings overlooked environmental and ethical challenges. This review explores four critical areas where AI's impact extends beyond performance: energy consumption, electronic waste (e-waste), inequality in compute access, and the hidden energy burden of cybersecurity systems. Drawing from recent studies and institutional reports, the paper highlights systemic issues such as high emissions from model training, rising hardware turnover, global infrastructure disparities, and the energy demands of securing AI. By connecting these concerns, the review contributes to Responsible AI discourse by identifying key research gaps and advocating for sustainable, transparent, and equitable development practices. Ultimately, it argues that AI's progress must align with ethical responsibility and environmental stewardship to ensure a more inclusive and sustainable technological future.


Arbitrary Decisions are a Hidden Cost of Differentially Private Training

Kulynych, Bogdan, Hsu, Hsiang, Troncoso, Carmela, Calmon, Flavio P.

arXiv.org Artificial Intelligence

Mechanisms used in privacy-preserving machine learning often aim to guarantee differential privacy (DP) during model training. Practical DP-ensuring training methods use randomization when fitting model parameters to privacy-sensitive data (e.g., adding Gaussian noise to clipped gradients). We demonstrate that such randomization incurs predictive multiplicity: for a given input example, the output predicted by equally-private models depends on the randomness used in training. Thus, for a given input, the predicted output can vary drastically if a model is re-trained, even if the same training dataset is used. The predictive-multiplicity cost of DP training has not been studied, and is currently neither audited for nor communicated to model designers and stakeholders. We derive a bound on the number of re-trainings required to estimate predictive multiplicity reliably. We analyze--both theoretically and through extensive experiments--the predictive-multiplicity cost of three DP-ensuring algorithms: output perturbation, objective perturbation, and DP-SGD. We demonstrate that the degree of predictive multiplicity rises as the level of privacy increases, and is unevenly distributed across individuals and demographic groups in the data. Because randomness used to ensure DP during training explains predictions for some examples, our results highlight a fundamental challenge to the justifiability of decisions supported by differentially private models in high-stakes settings. We conclude that practitioners should audit the predictive multiplicity of their DP-ensuring algorithms before deploying them in applications of individual-level consequence.
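The core phenomenon — that the Gaussian noise used in DP-SGD-style training can flip a model's prediction on a fixed input across retrainings — is easy to reproduce in miniature. The following is an illustrative numpy sketch, not the authors' code: the toy `dp_sgd_train` function, the hyperparameters, and the borderline test point are all assumptions made for demonstration.

```python
import numpy as np

def dp_sgd_train(X, y, seed, epochs=50, lr=0.1, clip=1.0, sigma=1.0):
    """Toy DP-SGD for logistic regression: per-example gradient
    clipping plus Gaussian noise, as in the abstract's description."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))
        grads = (preds - y)[:, None] * X                  # per-example gradients
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads / np.maximum(1.0, norms / clip)     # clip to norm <= clip
        noise = rng.normal(0.0, sigma * clip, size=w.shape)
        w -= lr * (grads.sum(axis=0) + noise) / len(X)    # noisy aggregate step
    return w

# Synthetic data with a noisy decision boundary.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.3 * rng.normal(size=200) > 0).astype(float)

# Retrain 30 equally-private models (same data, different randomness)
# and record the predicted label for one borderline input.
x_test = np.array([0.05, -0.1])
labels = [int(x_test @ dp_sgd_train(X, y, seed) > 0) for seed in range(30)]

# The fraction of retrainings predicting class 1 quantifies the
# predictive multiplicity of this input under DP training.
print(sum(labels) / len(labels))
```

Inputs far from the decision boundary tend to get stable labels across seeds, while borderline inputs can flip; the paper's finding is that this instability grows as the privacy level (noise) increases.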


Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models

Li, Jingxuan, Huang, Rui, Li, Wei, Yao, Kai, Tan, Weiguo

arXiv.org Artificial Intelligence

Code completion is widely used by software developers to provide coding suggestions given a partially written code snippet. Apart from traditional code completion methods, which only support single-token completion at limited positions, recent studies show the ability to provide longer completions at more flexible positions. However, such frequently triggered and longer completion results reduce overall precision, as they generate more invalid results. Moreover, different studies are mostly incompatible with each other. Thus, it is vital to develop an ensemble framework that can combine results from multiple models to draw on the merits and offset the defects of each model. This paper conducts a coding simulation to collect data from code context and different code completion models, and then applies the data to two tasks. First, we introduce an acceptance model that can dynamically control whether to display completion results to the developer. It uses simulation features to predict whether correct results exist in the output of these models. Our best model reduces the percentage of false-positive completions from 55.09% to 17.44%. Second, we design a fusion ranking scheme that can automatically identify the priority of the completion results and reorder the candidates from multiple code completion models. This scheme is flexible in dealing with various models, regardless of the type or length of their completion results. We integrate this ranking scheme with two frequency models and a GPT-2-styled language model, along with the acceptance model, to yield 27.80% and 37.64% increases in TOP1 and TOP5 accuracy, respectively. In addition, we propose a new code completion evaluation metric, Benefit-Cost Ratio (BCR), which takes into account the benefit of keystrokes saved and the hidden cost of browsing the completion list, bringing evaluation closer to the real coding experience.
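The abstract does not give the exact BCR formula, but the trade-off it describes — keystrokes saved by accepting a completion versus the cost of scanning the candidate list — can be sketched as follows. This is a hypothetical illustration: the function name, the `browse_cost` weight, and the items-scanned accounting are all assumptions, not the paper's definition.

```python
def benefit_cost_ratio(candidates, accepted_index, accepted_text, browse_cost=1.0):
    """Hypothetical sketch of a keystrokes-vs-browsing trade-off.

    Benefit: characters the developer did not have to type.
    Cost: list items scanned before accepting (weighted by browse_cost).
    The real BCR in the paper may be defined differently.
    """
    if accepted_index is None:
        # Nothing accepted: the popup was pure browsing cost, zero benefit.
        return 0.0
    keystrokes_saved = len(accepted_text)
    browsing_cost = browse_cost * (accepted_index + 1)  # items scanned so far
    return keystrokes_saved / browsing_cost

# Accepting the 2nd of two candidates saves 11 keystrokes at a cost of
# scanning 2 items:
print(benefit_cost_ratio(["foo", "foo_bar_baz"], 1, "foo_bar_baz"))  # → 5.5
```

Under a metric like this, a model that fires rarely but ranks the correct completion first scores better than one that floods the developer with long, low-precision candidate lists — which is the behavior the acceptance model and fusion ranking scheme are designed to encourage.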


The Hidden Costs of Open-Source AI Solutions

#artificialintelligence

It's hard to imagine now, but a decade ago, open-source software--programs that allow users to modify the source code--was still on the fringes. Startups were starting to build on open source and open core, but few, if any, enterprises were. Looking back, we can now say that open-source models undoubtedly accelerated both the pace of innovation and the quality of traditional software development. Nowadays, most anyone who is trying to build a successful SaaS product typically leverages as much open-source code as possible. Given their success building open-source SaaS solutions, it makes sense that many enterprises would strongly consider building out their AI capabilities in house.


The Hidden Cost of Big-Ticket Text Analytics: Time

#artificialintelligence

The inspiration for this week's clip in our "Get the Job Done!" series is the big-ticket procurement and implementation process--and all of those folks whose opinions you don't need. We hear all the time from prospective clients who've found themselves bogged down in the painful, protracted process of getting buy-in for enterprise text analytics platforms that offer something for everyone and come with a six-figure price tag. Oftentimes, this procurement process involves people in the organization who have lots of opinions but no research expertise and who, in some cases, won't even be using the purchase in question. Worse yet, after everyone has had his or her say and the purchase has finally gone through, the original intended user finds the whole initiative mired in a lengthy, complicated implementation! It's 2017, and the one thing no one can afford to waste is time.